Correction of bias from non-random missing longitudinal data using auxiliary information.
نویسندگان
چکیده
Missing data are common in longitudinal studies due to drop-out, loss to follow-up, and death. Likelihood-based mixed effects models for longitudinal data give valid estimates when the data are missing at random (MAR). These assumptions, however, are not testable without further information. In some studies, there is additional information available in the form of an auxiliary variable known to be correlated with the missing outcome of interest. Availability of such auxiliary information provides us with an opportunity to test the MAR assumption. If the MAR assumption is violated, such information can be utilized to reduce or eliminate bias when the missing data process depends on the unobserved outcome through the auxiliary information. We compare two methods of utilizing the auxiliary information: joint modeling of the outcome of interest and the auxiliary variable, and multiple imputation (MI). Simulation studies are performed to examine the two methods. The likelihood-based joint modeling approach is consistent and most efficient when correctly specified. However, mis-specification of the joint distribution can lead to biased results. MI is slightly less efficient than a correct joint modeling approach and can also be biased when the imputation model is mis-specified, though it is more robust to mis-specification of the imputation distribution when all the variables affecting the missing data mechanism and the missing outcome are included in the imputation model. An example is presented from a dementia screening study.
منابع مشابه
A Non-Random Dropout Model for Analyzing Longitudinal Skew-Normal Response
In this paper, multivariate skew-normal distribution is em- ployed for analyzing an outcome based dropout model for repeated mea- surements with non-random dropout in skew regression data sets. A probit regression is considered as the conditional probability of an ob- servation to be missing given outcomes. A simulation study of using the proposed methodology and comparing it with a semi-parame...
متن کاملMultiple imputation using linked proxy outcome data resulted in important bias reduction and efficiency gains: a simulation study
Background When an outcome variable is missing not at random (MNAR: probability of missingness depends on outcome values), estimates of the effect of an exposure on this outcome are often biased. We investigated the extent of this bias and examined whether the bias can be reduced through incorporating proxy outcomes obtained through linkage to administrative data as auxiliary variables in multi...
متن کاملDepartment of Quantitative Social Science Reducing bias due to missing values of the response variable by joint modeling with an auxiliary variable
In this paper, we consider the problem of missing values of a continuous response variable that cannot be assumed to be missing at random. The example considered here is an analysis of pupil’s subjective engagement at school using longitudinal survey data, where the engagement score from wave 3 of the survey is missing due to a combination of attrition and item non-response. If less engaged stu...
متن کاملRunning head: SELECTION OF AUXILIARY VARIABLES 1 Selection of auxiliary variables in missing data problems: Not all auxiliary variables are created equal
The treatment of missing data in the social sciences has changed tremendously during the last decade. Modern missing data techniques such as multiple imputation and full-information maximum likelihood are used much more frequently. These methods assume that data are missing at random. One very common approach to increase the likelihood that missing at random is achieved, consists of including m...
متن کاملA Bayesian Approach to Estimate Parameters of a Random Coefficient Transition Binary Logistic Model with Non-monotone Missing Pattern and some Sensitivity Analyses
A transition binary logistic model with random coefficients is proposed to model the unemployment statues of household members in two seasons of spring and summer. Data correspond to the labor force survey performed by Statistical Center of Iran in 2006. This model is introduced to take into account two kinds of correlation in the data one due to the longitudinal nature o...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Statistics in medicine
دوره 29 6 شماره
صفحات -
تاریخ انتشار 2010